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1 . 

~ In advanced robot control problems, on-line computation of Inverse Jacobian solution I 

Is frequently required. Parallel processing architecture Is an effective way to reduce j 

computation time, i n t hi s - -paper, j. parallel processing architecture Is developed for the / 

Inverse Jacobian (Inverse differed!*] al kinematic equation) of PUMA arm,£4£. The proposed / 

pipeline/parallel algorithm can be Implemented on IC chip using systolic linear arrays. / 

This Implementation requires 27 processing cells and 25 time units. Computation time Is 

thus significantly reduced. J 

2. Introduction 

In many advanced robot control problems, such as with sensor guided manipulations. It Is essential that the 
end effector be appropriately controlled In Cartesian coordinates so that the robot can adapt to a changing 
environment. This means that we need to compute the Inverse Jacobian In real time to provide the required 
differential change In joint variables for a desired differential change In position and orientation. The speed 
of this computation directly affects the speed of robot operation. Thus efficient algorithms for computing the 
Inverse Jacobian are needed. 


There have been efforts made recently In developing computationally efficient algorthlmc to solve the 
Jacobian problem suitable for serial computer Implementation [1,2]. In addition some work has been reported In 
algorithm development for implementation on pipelined or parallel computer [3]. These results show that such 
parallel algorithms can reduce computation time significantly. 

A more Important requirement in robot manipulation Is the computing of the Inverse Jacobian solution. This 
is generally a troublesome problem when we try to Invert the Jacobian numerically. A more direct approach Is to 
derive an explicit solution of the Inverse Jacobian for a given robot. Paul, Shlmano, and Mayer [2] have shown 
that such solutions can be obtained by differentiating the kinematic equations. This approach has shown to 
result simpler Inverse Jacobian solutions with regard to manipulator degeneracies and joint constraints. The 
inverse Jacobian of the PUMA arm has been solved specifically In [2]. 

In this paper, we present a pi pel 1 ne/paral lei algorithm and architecture for computing the PUMA arm Inverse 

Jacobian derived In [2]. With rapid advances in VLSI technology, this type of algorithm can be readily 

Implemented on IC chips. These special purpose chips can be connected to a host computer system to achieve 

real-time Cartesian space control at sufficiently high sample rate. It Is noted that a study has bet.i made 
recently to Implement direct kinematic solution on VLSI chips to speed up computation time [4]. The goal here 

Is to further exploit the advantages of VLSI technology for the aeslgn of customized chips dedicated to the 

computing of the Inverse Jocoblan of PUMA arm. 

3. Differential Kinematic Solution of PUMA Arm 

Differential changes In joint variables dqi can be related to the different changes in translation and 
rotation dx, dy, dz, fi x , fiy, and fi 2 of the end effector by the relationship 

[d x * dy* dz* fi x , fiy, 67,] * J [dqi, dq2,...»dq n ] ( 1 ) 

In which n Is the number of joints, and J Is the Jacobian matrix. But in advanced robot control problems, we 

need the solution of dq-j given the desired differential change d x , dy, d z , fi x , fiy, fi 2 . That is we need to 

compute the Inverse problem 

[dqi, dq 2 ,...,dq n ] T - j’ 1 [d x , d y , d 2 , fix, fiy, fiz] T (2) 

This represents the Inverse differential kinematic solution (Inverse Jacobian) of the robot arm. 

Instead of relying on the direct computing of the Inverse Jacobian matrix J~* , an analytical solution of 
the Inverse Jacobian problem can be frequently obtained, and such a solution for the PUMA arm is given in [2]. 
For the PUMA arm, the joint variables are the six rotational joint angles e^, ©^....©g. Furthermore, the 
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differential changes In translation and rotation can be related to the differential change of the end effector 
honogeons Matrix T [2]: 


dT - T 


(3) 


where 


dT 


T - 

n o a p 
0 0 0 1 


dn x 

do x 

<u x 

dPx" 

dhy 

do y 

day 

dpy 

dn 2 

do z 

da z 

dpz 

0 

0 

0 

0 

0 

-«z 

«y 

dx" 


0 

•fix 

dy 

-6y 

«x 

0 

dz 

0 

0 

0 

0 


AT = 


Therefore differential changes In translation and rotation can also be specified' In terms of the x, y, z 
elements of dp, do, and da (dn vector Is redundant). The desired solutions of doi In terms of dp, do, and da 
for the PUMA arm obtained In [2] are given In the appendix. A plpellne/parallel processing architecture fcr 
computing these equations Is now developed below. 

4. Systolic Array Processing 

VLSI technology has created a new architecture horizon In Implementing parallel algorithms directly on 
hardware. Central to this architecture Is the use of systolic linear arrays which consist of Inter- 
connected simple and mostly Identical processing cells. Algorithms that can be executed using Identical 
operations simultaneously can take advantage of the systolic array architecture to reduce computation time. 

The processing cell ftructure we will employ Is the “Inner product step processor* which performs 
matrix-vector multiplication using one-way pipeline algorithms. For example, computing 

Ab * p 

where A Is nxm and b Is mxl , can be carried out In the following recurrence manner: 

pi°) * 0 


( 4 ) 




p(*0+ a k 

*1 a 1k D k 


l,m, 1 ■ l,n 




p(») 


This operation can be Implemented by a linear array of m Inner product step processors shown In Figure l. 

In the following section, we will reformulate the Inverse differential kinematic equation given In the 
appendix In terms of a set of matrix-vector multiplications which can be computed In parallel and pipelining 
fashion. 

5. Algorithm Development 

In this section, we present the matrix-vector multiplication processing schemes for computing the 
dl fferentlals doi, 1 - 1,2,.. .,6. Here we assime that the trigonometric functions required are available. 
Typically these functions can' be generated by employing ROM look-up techniques [5,6]. The algorithm Is broken 
down Into 15 steps as described below. The notation $ S1ne>. * CosOj are used. 

(1) - Pl 

dpy dp x p x py ay a x da x -a x day Oy o x do x doy 

-dp x dpy py -p X -a X 3y d3y “fly -dfly — 0 X Oy dOy -dO X 
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b i ■ l c i s \] p i * ( p ir # " p i9 p no** * p ii3 J 
output: dOj • Pxx/Pl3 

(2) f x mj ♦ g 2 ■ hj 

f l ■ l p 14 p 15 p 18 P 110 ’ p Ull m i ' do i 
®1 * t p 12 P 17 p 14 p 112 P 113 1 



"l" 

[ h ll* 

... 

h i 5 l 
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output: dOj * P3i/ h 23 

(6) A 4 b 4 - p 4 



P5 ' (P 51 P S2 1 



b 5 * t dp z h ll d0 3 1 


(8) Agb6 ■ P6 

a T m p 42 * a z p 16 “ p 51 b 12 da z p lll ~°z h 14 ‘ do z 

6 - p 41 - p 16 “t - p 52 - da t h 12 -°/ * P 1U " d °z ' h 14 


outputs: d023 a P64/P51 d0 ? " d0 23 " d0 3 
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(9) f 3 m 3 

'S' 

«3 ’ 


(10) A ; b ? 


+ g 3 - h 3 

[ p 3 p 63 p 68 * p 67 1 
t p 65 p 66 p 69 P 61ol 

( h 31* * * * h 34 1 
‘ p 7 

P 15 p 63 
p 32 * h 13 


m 3 * d0 23 


p 15 

p 63 


output: do 4 ■ p 7 g/Pn 

(11) A 8 b 8 . p 8 


f p 15 P 31 p 67 

A 8 ■ 

[_* p 63 h 13 P 1 16 
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(12) f 4 m 4 + g 4 ■ h 4 


p U0 h 33 " p 67 h l5 ' 
* p 67 h 15 * P U0 ’ h 33. 

p 8 * t p 81 * * * * p 87 1 
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m 4 • do 4 

h 4 " ( h 4 i h 42 b 4 3 1 

p 9l" 
p 9 * P 92 

_ p 93_ 

[P93] m 5 * d0 5 h 5M h 5l] 
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output: dOg * Pjoi 


The data flow timing table for these computations are given in Tables 1 and 2. It Is shewn that the 
solution requires 25 time units and 27 processing cells. 

The results of 25 time units is a significant reduction of computation time in comparison with that when a 
serial computer Is used to compute the original solution. The total number of multiplications of that solution 
is about 150. This is equivalent to 150 time units In the systole array processing system as opposed to the 25 
% time units we have achieved by exploiting parallelism. 
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Tabl« 1. Oata Mow timing table for steps 1 through 8 which compute doi 863 and d03. Nunbers 
on top row indicate time units. 
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Tabic 2 . Oata flow timing table for steps 9 through IS which compute dee, d 0 5 . de$ 


10 11 12 13 14 IS 16 17 18 19 20 21 22 23 24 2S 


p 64 p 65 

P63 P66 

P68 p 69 

- p 67 P610 

PlS P63 Pi 5 

h3i -hj3 P63 

p 15 -P63 Ce 

h3i hj3 S 4 

P67 Pi 10 


PllO 'P67 

h 33 *15 

* p 67 "PllO 

h^ -h33 


P 8 i d0 4 • p 82 

p 36 p 85 

p 86 p 87 

*>41 * h 32 C 5 

-pGG P83 S 5 

-h42 -h34 

P 92 d0 5 * P93 

* h 43 C6. S 6 


6, Conclusion 

It has been demonstrated In this paper that parallel computing architecture can be developed for the 
Inverse differential kinematic equation of the PUMA arm. By using systolic linear arrays employed in VLSI chip 
design, the computation can be completed with 27 processing cells in 25 time units. 

The differential kinematic equation in Its original form requires about 150 multiplications to compute. If 
one multiplication Is counted as one time unit, the parallel architecture definitely provides a substantial 
reduction in computation time. A customized 1C chip dedicated to this algorithm can be fabricated. 
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Appendix 


de, 


d0n 


: Inverse Differential Kinematic Equations for PUHA arm 

C,dP y - SjdP x 

■ c i<*v;' 

d i 

*2 (d 4 c 3 " a 3 5 3^ 

dj • f jj ( p)df jj ( p) * Pj 2 (p)df j 2 (p) 
flj(p) ■ CjP,, • 

dfjj(p) ■ -^P x dei * Cjdp x ♦ CjPydej ♦ Sjdp y 
f 12 (p) - -dp z 


de 23 


-S 23 d « 2 - C 23 d ^ 1 
C 23 v 2 * S 23 u l 


V 1 * “ w 2*ll^ * w i p z 
v 2 " u l f ll( p > + 2 

* d 3 ^ 

* il 2 " d 4 * *2^3 

dvj ■ -L^dfjjfp) “ a A 2 f n(p) C 3 d0 3 + u l dp z ' a 2^3 p z de 3 
dv 2 ■ w 1 df 11 (p) - a 2 5 3 r U^ dfl 3 * “2 dP z + a 2 C 3 p 3 de 3 
de 2 * d0 2 3 - ^63 

NC 4 d(NS 4 ) - NS 4 d(NC 4 ) 

d0 A ■ n n 

4 <ns 4 ) z ♦ (nc 4 ) z 

NC 4 * C 23 D 41 - S 23 a z 
°41 * c l a x * s l a y 

d(NS 4 ) • -C 1 a x da 1 - S 1 a y de l ♦ Cjda z - Sjda x 
ns 4 * * s l a x * c l a y 

d(NC 4 ) * -S 2 30 41 d9 2 3 ♦ C 23 dD 41 ' ~ C 23 a z de 23 ' S 23 da z 

dD 41 * * S l a x d9 l + C l da x * C l a y d9 l + S l da z 
dflc 3 C^dS^ - S^dC^ 

dS 5 • -S 4 NC 4 de 4 ♦ C 4 d(NC 4 ) ♦ C 4 d9 4 NS 4 ♦ S 4 d(NS 4 ) 
dC 5 * C 2 3d9 23 D 41 ♦ s 23 d0 41 * S 23 a Z d9 23 + C 23 da Z 

s 5 ■ c 4 nc 4 ♦ s 4 ns 4 

C 5 * S 23 3 41 + C 23 a Z 
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“ ^6 d *6 " *6*^6 
S 6 " ‘ C 5*61 ' S 5*612 
Cj ■ -S« n 6H * C 4 M 61 12 

dS 6 " S 5*61 d# 5 * C 5 d *61 ' C 5*612 d, S * S 5 dW 612 
dC 6 • -dS 4 N 6n - S 4 «« 6ll ♦ <tf 4 N 6U2 ♦ C 4 dN 6112 

* - C 4*6U d *4 ' s 4 d *6U + * S 4*6112 d# 4 * C 4 d *6112 
*61 * c 4*611 * S 4*6U2 
*611 ‘ C 23*6111 * S 23°z 
*6112 ’ * s l°x * C l°y 

dN gl - -S 4 N 611 de 4 ♦ C 4 dN 6U + C 4 N 6n2 d9 4 ♦ S 4 dN 612 

d *6U * " S 23 N 6111 d9 23 * C 23 d *6111 ' C 23°z de 23 * ^^z 
d *6112 ‘ * C l°x d9 l - S l d0 x * S l°/ 9 1 * c i d0 y 
*6111 * c l°x * S l°y 

d *6111 * *W 9 1 * c l d0 x * C l°y d9 l + s i d0 y 
*612 * * S 23*6111 * C 23°z 

d *612 * " C 23*6111 d9 23 * S 23 d *llll + S 23°z de 23 * C 23 d0 z 
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Figure 1. Inner product step processor 
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